tiny-tts

Ultra-lightweight English text-to-speech model (1.6M params, ~3.4MB ONNX)


TinyTTS

Ultra-lightweight English Text-to-Speech — only 1.6M parameters, ~3.4 MB ONNX

Highlights

TinyTTS is an end-to-end text-to-speech model that delivers natural-sounding speech with a fraction of the resources required by conventional TTS systems.

Metric	TinyTTS	Typical TTS Models
Parameters	~1.6M	50M–200M+
Checkpoint size	~3.4 MB (ONNX FP16)	200 MB–1 GB+
Sample rate	44.1 kHz	22.05–44.1 kHz
End-to-end	Yes	Often requires separate vocoder

With only 1.6 million parameters and an ONNX model of just ~3.4 MB (FP16), TinyTTS runs comfortably on CPU-only machines, edge devices, and embedded systems — making real-time speech synthesis accessible without a GPU.
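As a rough sanity check on the size figures above: FP16 stores each weight in 2 bytes, so the raw weights should account for most of the ONNX file. The sketch below uses the 1.62M parameter count quoted in the benchmarks section. Attributing the remaining bytes to graph structure and metadata is an assumption, not something the project states.

```python
# Back-of-the-envelope size estimate for an FP16 model.
PARAMS = 1_620_000      # parameter count quoted in this README
BYTES_PER_FP16 = 2      # half precision stores each weight in 2 bytes


def fp16_weight_megabytes(num_params: int) -> float:
    """Return the size of the raw FP16 weights in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_FP16 / 1_000_000


weights_mb = fp16_weight_megabytes(PARAMS)
print(f"FP16 weights: ~{weights_mb:.2f} MB")
```

This gives about 3.24 MB for the weights alone, which is consistent with the ~3.4 MB ONNX file.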

Installation

From source (pip install)

git clone https://github.com/tronghieuit/tiny-tts.git
cd tiny-tts
pip install -e .

After installing, the tiny-tts command is available on the PATH of the active Python environment:

tiny-tts --checkpoint G.pth --text "Hello world" --device cuda

Dependencies only

pip install torch torchaudio soundfile g2p-en transformers numba

Quick Start

Basic inference

tiny-tts \
  --text "The weather is nice today, and I feel very relaxed." \
  --checkpoint G.pth \
  --output output.wav \
  --speaker MALE \
  --speed 1.0 \
  --device cuda

CPU inference

tiny-tts \
  --text "The weather is nice today, and I feel very relaxed." \
  --checkpoint G.pth \
  --device cpu

Output files are saved to infer_outputs/.
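If you want to drive the CLI from a script, the invocations above can be wrapped as follows. This is a minimal sketch: the helper names `build_tiny_tts_command` and `synthesize` are hypothetical, and only the flags documented in this README are used. The helper defaults to `cpu` so that it also works on machines without a GPU.

```python
import subprocess
from typing import List, Optional


def build_tiny_tts_command(
    text: str,
    checkpoint: Optional[str] = None,
    output: str = "output.wav",
    speaker: str = "MALE",
    speed: float = 1.0,
    device: str = "cpu",
) -> List[str]:
    """Assemble a tiny-tts invocation from the flags documented in this README."""
    cmd = [
        "tiny-tts",
        "--text", text,
        "--output", output,
        "--speaker", speaker,
        "--speed", str(speed),
        "--device", device,
    ]
    if checkpoint is not None:
        # Per the CLI table, the checkpoint is auto-downloaded when omitted.
        cmd += ["--checkpoint", checkpoint]
    return cmd


def synthesize(text: str, **kwargs) -> None:
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_tiny_tts_command(text, **kwargs), check=True)


# Building the command has no side effects, so it is safe to inspect first.
example = build_tiny_tts_command("Hello world")
print(example)
```

Passing the arguments as a list (rather than a single shell string) avoids quoting problems when the text contains spaces or punctuation.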

Python API

TinyTTS can also be called directly from Python:

from tiny_tts import TinyTTS

# Initialize the TTS model (auto-detects device and downloads default checkpoint if missing)
tts = TinyTTS()
# OR specify a custom checkpoint: tts = TinyTTS(checkpoint_path="...")

# Synthesize a single sentence
tts.speak("Hello, this is a test of the Python API.", output_path="hello.wav")

# Adjust speech speed (1.0=normal, 1.5=faster, 0.7=slower)
tts.speak("This is faster speech.", output_path="fast.wav", speed=1.5)
tts.speak("This is slower speech.", output_path="slow.wav", speed=0.7)

# Synthesize a long paragraph (5 sentences)
paragraph = (
    "TinyTTS is an ultra-lightweight text-to-speech model. "
    "It has only one point six million parameters, which makes it extremely fast. "
    "You can run it easily on your local CPU without a dedicated graphics card. "
    "The audio quality remains surprisingly clear despite the small model size. "
    "I hope you enjoy building exciting applications with it!"
)
tts.speak(paragraph, output_path="paragraph.wav")
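When one file per sentence is more convenient than a single long recording, the same `speak()` call can be applied in a loop. The sketch below is illustrative only: `split_sentences` and `speak_sentences` are hypothetical helpers, the splitting rule is deliberately naive (it will break on abbreviations such as "e.g."), and the only TinyTTS call used is `speak(text, output_path=..., speed=...)` as shown above.

```python
import re
from pathlib import Path
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows '.', '!' or '?' (a deliberately simple rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def speak_sentences(tts, text: str, out_dir: str = "sentences", speed: float = 1.0) -> List[Path]:
    """Synthesize each sentence of `text` to its own WAV file via tts.speak()."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, sentence in enumerate(split_sentences(text), start=1):
        path = out / f"sentence_{i:02d}.wav"
        tts.speak(sentence, output_path=str(path), speed=speed)
        paths.append(path)
    return paths
```

With the objects from the example above, `speak_sentences(tts, paragraph)` would write `sentence_01.wav` through `sentence_05.wav` into `sentences/`.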

Inference Benchmarks

Benchmarked on real hardware with the sentence:
"The weather is nice today, and I feel very relaxed." (~4.9s of audio at 44.1kHz)

CPU: Intel Core (laptop, no GPU)
PyTorch: 2.5.1+cu121
Model: 1.62M parameters

Backend	Synthesis Time	Audio	RTFx
ONNX Runtime (CPU)	92 ms	4.88s	~53x 🚀
PyTorch (CPU)	272 ms	4.88s	~18x

RTFx = Audio Duration ÷ Synthesis Time (higher = faster).
With only 1.62M params, TinyTTS synthesizes ~5s of 44.1kHz audio in 92ms via ONNX — approximately 53× real-time on a laptop CPU.
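The RTFx figures in the table follow directly from the definition above. A short worked example, using only the timings reported in this section:

```python
def rtfx(audio_seconds: float, synthesis_seconds: float) -> float:
    """Real-time factor: seconds of audio produced per second of compute."""
    if synthesis_seconds <= 0:
        raise ValueError("synthesis_seconds must be positive")
    return audio_seconds / synthesis_seconds


onnx_rtfx = rtfx(4.88, 0.092)    # ONNX Runtime (CPU): 92 ms
torch_rtfx = rtfx(4.88, 0.272)   # PyTorch (CPU): 272 ms
print(f"ONNX Runtime: ~{onnx_rtfx:.0f}x, PyTorch: ~{torch_rtfx:.0f}x")
# prints "ONNX Runtime: ~53x, PyTorch: ~18x"
```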

Comparison with Other TTS Engines

All numbers are CPU-only inference benchmarked on the same machine (Intel Core laptop, no GPU).
Text: "The weather is nice today, and I feel very relaxed."
Protocol: 5 warm-up runs + 20 timed runs (median). Model load time excluded.

Engine	Params	TTFA (ms)	Total (s)	Audio (s)	RTFx
TinyTTS (ONNX)	1.6M	86	0.092	4.88	~53x 🚀
Piper (ONNX, 22kHz)	~63M	114	0.112	2.91	~26x
TinyTTS (PyTorch)	1.6M	295	0.272	4.88	~18x
KittenTTS nano	~10M	298	0.286	4.87	~17x
Supertonic (2-step)	~82M	260	0.249	3.69	~15x
Pocket-TTS	100M	1055	0.928	3.68	~4x
Kokoro ONNX	82M	943	0.933	3.16	~3x
KittenTTS mini	~25M	1965	2.047	4.17	~2x

TTFA = Time To First Audio. RTFx = Audio Duration ÷ Synthesis Time (higher = faster).
⚠️ Output sample rates differ: Piper 22kHz, KittenTTS 24kHz, TinyTTS/Supertonic 44.1kHz.
Among the engines tested, TinyTTS (ONNX) has both the smallest parameter count and the highest RTFx: only 1.6M params / 3.4 MB ONNX, yet ~53× real-time at 44.1kHz.

CPU vs GPU vs ONNX Summary

Backend          | Synthesis Time | Audio  | RTFx
-----------------|----------------|--------|----------
CPU (ONNX)       | 0.092 s        | 4.88s  | ~53x 🚀
CPU (PyTorch)    | 0.272 s        | 4.88s  | ~18x
GPU (CUDA, est.) | ~0.015 s       | 4.88s  | ~325x

ONNX Runtime is the recommended backend for CPU deployment — it provides ~3× speedup over PyTorch eager mode by fusing ops and eliminating Python dispatch overhead.
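This README does not document the ONNX model's input and output tensor names, so the sketch below only opens the model on the CPU provider and lists what it expects rather than guessing at a full inference call. It assumes `onnxruntime` is installed separately (it is not in the dependency list above) and that `tinytts_fp16.onnx` from the project tree is available locally. The helper name `describe_onnx_inputs` is hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def describe_onnx_inputs(model_path: str) -> List[Tuple[str, list, str]]:
    """Return (name, shape, type) for every input of an ONNX model on the CPU provider."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"ONNX model not found: {path}")
    # Imported lazily so the helper can be defined without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(str(path), providers=["CPUExecutionProvider"])
    return [(inp.name, inp.shape, inp.type) for inp in session.get_inputs()]
```

Calling `describe_onnx_inputs("tinytts_fp16.onnx")` and printing the result is a reasonable first step before wiring the model into an ONNX-only deployment.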

Run benchmarks yourself

python benchmark.py

The script compares TinyTTS (PyTorch and ONNX) against Piper, Kokoro, KittenTTS, Pocket-TTS, and Supertonic on CPU.
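For timing your own code paths, the protocol described in the comparison section (5 warm-up runs, then 20 timed runs, reporting the median) can be sketched as below. This is not the contents of `benchmark.py`; `median_latency` is a hypothetical helper built on the standard library only.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], warmup: int = 5, runs: int = 20) -> float:
    """Return the median wall-clock time of fn() in seconds after `warmup` untimed calls."""
    for _ in range(warmup):
        fn()  # warm caches and lazy initialisation; not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

For example, `median_latency(lambda: tts.speak(text, output_path="bench.wav"))` would time the Python API end to end. Note that this measurement includes writing the WAV file, so it may come out slightly higher than a synthesis-only figure.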

CLI Arguments

Argument	Short	Default	Description
`--text`	`-t`	"The weather is nice today..."	Text to synthesize
`--checkpoint`	`-c`	(optional)	Path to `G.pth`. Auto-downloads if omitted.
`--output`	`-o`	`output.wav`	Output audio filename
`--speaker`	`-s`	`MALE`	Speaker ID
`--speed`		`1.0`	Speech speed (1.0=normal, 1.5=faster, 0.7=slower)
`--device`		`cuda`	Device: `cuda` or `cpu`
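For readers who want to mirror this interface in their own tooling, the table above maps onto `argparse` roughly as follows. This is a sketch under the assumption that the flags behave as documented; it is not the project's actual `infer.py`, and the default text is kept exactly as truncated in the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags listed in the CLI Arguments table."""
    parser = argparse.ArgumentParser(prog="tiny-tts", description="TinyTTS inference (sketch)")
    # Default text is shown truncated in the README; the full string is not filled in here.
    parser.add_argument("--text", "-t", default="The weather is nice today...",
                        help="Text to synthesize")
    parser.add_argument("--checkpoint", "-c", default=None,
                        help="Path to G.pth; auto-downloaded if omitted")
    parser.add_argument("--output", "-o", default="output.wav", help="Output audio filename")
    parser.add_argument("--speaker", "-s", default="MALE", help="Speaker ID")
    parser.add_argument("--speed", type=float, default=1.0,
                        help="Speech speed (1.0=normal, 1.5=faster, 0.7=slower)")
    parser.add_argument("--device", choices=["cuda", "cpu"], default="cuda",
                        help="Device: cuda or cpu")
    return parser


args = build_parser().parse_args(["-t", "Hello world", "--device", "cpu"])
print(args.text, args.device, args.speed)
# prints "Hello world cpu 1.0"
```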

Project Structure

tiny-tts/
├── infer.py                  # Main inference script
├── TinyTTS.png               # Project logo
├── setup.py                  # Package setup (pip install)
├── pyproject.toml            # Build configuration
├── G.pth                     # Pre-trained checkpoint (FP16: ~17 MB)
├── tinytts_fp16.onnx         # ONNX FP16 model (~3.4 MB)
├── models/
│   └── synthesizer.py        # Model definition
├── nn/
│   ├── attentions.py         # Attention layers
│   ├── modules.py            # Neural network modules
│   ├── commons.py            # Utility functions
│   └── transforms.py         # Flow transforms
├── text/
│   ├── english.py            # English G2P pipeline
│   ├── symbols.py            # Phoneme symbol tables
│   ├── cmudict.rep           # CMU Pronouncing Dictionary
│   └── english_utils/        # Text normalization
├── alignment/
│   └── core.py               # Viterbi alignment
└── utils/
    └── config.py             # Model hyperparameters

TODO

Publish the training source code
Add more English speakers
Add ultra-lightweight zero-shot voice cloning
Release an even smaller model version while maintaining high accuracy

License

Licensed under the Apache License, Version 2.0.


Release history

Version	Released
0.3.2 (this version)	Apr 8, 2026
0.3.1	Apr 8, 2026
0.3.0	Apr 8, 2026
0.2.2	Apr 7, 2026
0.2.1	Apr 7, 2026
0.2.0	Apr 7, 2026

Download files


Source Distributions

No source distribution files are available for this release.

Built Distribution


tiny_tts-0.3.2-py3-none-any.whl (2.1 MB), uploaded Apr 8, 2026, Python 3

File details

Details for the file tiny_tts-0.3.2-py3-none-any.whl.

File metadata

Download URL: tiny_tts-0.3.2-py3-none-any.whl
Upload date: Apr 8, 2026
Size: 2.1 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for tiny_tts-0.3.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`efa1e822c9f0eea9e1a1c567d82b891d979e5365877518c27f2c554acf5a4f1f`
MD5	`70e9eaab2d61bdca7db5490803e5bff2`
BLAKE2b-256	`f764ccfe0db57f18bc0059a9fd42ed2e4f174aca26af5b185bf79ec86e3210ea`
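To check a downloaded wheel against the SHA256 digest listed above, something like the following can be used. It is a minimal sketch using only the standard library; `sha256_of` and `verify` are hypothetical helper names, and the wheel path is whatever location you downloaded the file to.

```python
import hashlib

# SHA256 digest published for tiny_tts-0.3.2-py3-none-any.whl (copied from the table above).
EXPECTED_SHA256 = "efa1e822c9f0eea9e1a1c567d82b891d979e5365877518c27f2c554acf5a4f1f"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest matches `expected`."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("tiny_tts-0.3.2-py3-none-any.whl")` should return `True` for an intact download of this release.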

